CLI Applications and Regular Expressions
CIS 193 – Go Programming
Prakhar Bhandari, Adel Qalieh
A command-line application, also known as a command-line interface (CLI) application, is a program designed to be used from a text interface, such as a shell inside a terminal.
CLIs usually take in inputs as arguments and flags/switches through a text interface.
CLIs are powerful because they can expose many options, and they can be automated and chained together with scripts.
Ex: Command Prompt in Windows, Terminal with bash in OSX and Linux.
Try: uni.xkcd.com/
To get the command-line arguments, use os.Args, a slice of strings holding the arguments.
The first element is always the program name (os.Args[0]).
    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        fmt.Printf("Hello, %s\n", os.Args[1])
    }
To call this, we run:
    $ ./hello Adel
    Hello, Adel
Command-line arguments are the simplest way to collect data from the CLI. os.Args is simply a []string, so the data is unstructured.
For example, to get the rest of the command line arguments, we can edit our program to:
    // Requires "strings" in the import list.
    func main() {
        fmt.Printf("Hello, %s\n", strings.Join(os.Args[1:], " "))
    }
So, to compare the original to our new program, we have:
    $ ./hello Adel Qalieh
    Hello, Adel
    $ ./hello2 Adel Qalieh
    Hello, Adel Qalieh
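Indexing os.Args[1] directly panics when the program is run with no arguments. A minimal sketch of a safer variant follows; the helper name greeting and the fallback name "stranger" are assumptions made for this example, not part of the original program.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// greeting builds the message from the arguments that follow the program
// name. The fallback "stranger" is a hypothetical default for this sketch.
func greeting(args []string) string {
	if len(args) == 0 {
		return "Hello, stranger"
	}
	return "Hello, " + strings.Join(args, " ")
}

func main() {
	// os.Args[0] is the program name, so the user's arguments start at index 1.
	fmt.Println(greeting(os.Args[1:]))
}
```

Slicing with os.Args[1:] is safe even when only the program name is present, because it simply yields an empty slice.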
To build and install a CLI to your system, simply run go install. This will install the binary into your $GOPATH/bin directory.
    $ go install
    $ $GOPATH/bin/lec8 Prakhar
    Hello, Prakhar
And if your system PATH includes $GOPATH/bin, you should be able to call it directly:
    $ lec8 Prakhar
    Hello, Prakhar
This makes it incredibly easy to make simple command line tools for your system.
What if you want structured data - numbers, booleans, required arguments, switches, etc.?
Use the flag package!
The flag package supports basic CLI parsing. First add a flag:
var times = flag.Int("times", 1, "number of times to print hello")
Then make sure to parse the flags with flag.Parse(). Note that times is an *int, so it must be dereferenced (*times) to read the value!
Also note that all positional arguments must come after all flags; they are accessed with flag.Args() and flag.Arg(i).
    func main() {
        flag.Parse()
        fmt.Println(*times)
    }
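Putting the pieces together, here is a sketch of the hello program with a -times flag and positional arguments. It uses flag.NewFlagSet rather than the package-level functions so that the parsing can be exercised with any argument slice; the helper name run and the FlagSet name "hello" are assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// run parses args with its own FlagSet and returns the lines to print.
func run(args []string) ([]string, error) {
	fs := flag.NewFlagSet("hello", flag.ContinueOnError)
	times := fs.Int("times", 1, "number of times to print hello")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	// fs.Args() holds everything that followed the flags.
	name := strings.Join(fs.Args(), " ")
	var lines []string
	for i := 0; i < *times; i++ {
		lines = append(lines, "Hello, "+name)
	}
	return lines, nil
}

func main() {
	lines, err := run(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Running it as ./hello -times 2 Adel prints the greeting twice. Running ./hello Adel -times 2 prints "Hello, Adel -times 2" once, because parsing stops at the first non-flag argument.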
What if we want a type that is not defined in the flag package? We can define custom flags by fulfilling the flag.Value interface:
    type Value interface {
        String() string
        Set(string) error
    }
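As an illustration of a custom flag, the sketch below defines a hypothetical nameList type that collects comma-separated names and may be given more than once. The type name, the -names flag, and the parseNames helper are all assumptions made for this example.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"strings"
)

// nameList is a hypothetical flag type that collects comma-separated names.
type nameList []string

// String returns the current value as text.
func (n *nameList) String() string {
	return strings.Join(*n, ",")
}

// Set is called once for every occurrence of the flag on the command line.
func (n *nameList) Set(value string) error {
	if value == "" {
		return errors.New("name list must not be empty")
	}
	*n = append(*n, strings.Split(value, ",")...)
	return nil
}

// parseNames registers the custom flag with fs.Var and parses args.
func parseNames(args []string) ([]string, error) {
	var names nameList
	fs := flag.NewFlagSet("hello", flag.ContinueOnError)
	fs.Var(&names, "names", "comma-separated names to greet")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return names, nil
}

func main() {
	names, err := parseNames(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	for _, n := range names {
		fmt.Printf("Hello, %s\n", n)
	}
}
```

The methods use a pointer receiver because Set has to modify the value in place, which is why &names is passed to fs.Var.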
To get input from a user in a command line application while the program is running, we need to make use of standard input. The simplest way is a utility function from the fmt package:
    var s string
    fmt.Scanln(&s)
However, to get buffered input, for example to read a whole line including spaces, we can use the bufio package in tandem with os.Stdin:
    reader := bufio.NewReader(os.Stdin)
    text, err := reader.ReadString('\n')
Finally, we can use the formatting verbs to get structured data from user input:
    var i int
    _, err := fmt.Scanf("%d\n", &i)
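The two approaches can be combined: read a full line with bufio, then apply a formatting verb to it with fmt.Sscanf. This is only a sketch; the age prompt and the parseAge helper are hypothetical names chosen for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// parseAge extracts an integer from one line of user input.
func parseAge(line string) (int, error) {
	var age int
	// Sscanf applies the same formatting verbs as Scanf, but to a string.
	if _, err := fmt.Sscanf(strings.TrimSpace(line), "%d", &age); err != nil {
		return 0, fmt.Errorf("expected a number: %w", err)
	}
	return age, nil
}

func main() {
	fmt.Print("How old are you? ")
	reader := bufio.NewReader(os.Stdin)
	line, err := reader.ReadString('\n')
	// io.EOF just means the input ended without a trailing newline.
	if err != nil && err != io.EOF {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	age, err := parseAge(line)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("You are %d years old.\n", age)
}
```

Keeping the parsing in its own function means it can be checked without a terminal attached.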
When you run go build or go install, the binary is built for your particular operating system and processor.
To build a binary for another platform, set the GOOS (operating system) and GOARCH (processor architecture) environment variables when running your build commands.
The full list of supported GOOS and GOARCH combinations is given in the Go documentation.
$ GOOS=linux GOARCH=386 go build test.go
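One way to confirm which platform a binary was built for is to print the runtime package's GOOS and GOARCH constants, which are fixed at build time. A small sketch, with target as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"runtime"
)

// target reports the platform this binary was compiled for.
func target() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("built for", target())
}
```

Built with GOOS=linux GOARCH=386 and run on a Linux machine, this should report linux/386 regardless of the machine it was compiled on. Running go tool dist list prints the supported combinations.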
Regular expressions (regex) are a way of matching or categorizing strings. They have their own arcane syntax but are used in a variety of contexts.
End goal: does this string match condition X? What part of the string matches? Can I extract the desired information from a block of text?
Go's regexp package implements the RE2 syntax and guarantees matching in time linear in the size of the input. In exchange, it omits features such as backreferences and lookaround that some other engines offer.
Most alphanumeric characters in regex will simply match that character. For example, the regex `go` matches all of the following strings:
    golang
    gopher
    google
    hugo
Note that regex is (usually) case sensitive, so it will not match Google.
To define a regex in Go, use a raw string literal, delimited by the backtick (`) character, so that backslashes do not need to be escaped:
goRegex := `go`
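To try the `go` pattern against the strings above, it can be compiled with regexp.MustCompile and tested with MatchString. The second pattern uses the (?i) flag from the RE2 syntax to switch on case-insensitive matching; the variable name anyCaseGo is an assumption for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// goRegex matches any string containing "go" in lower case.
	goRegex = regexp.MustCompile(`go`)
	// anyCaseGo uses the (?i) flag to ignore case.
	anyCaseGo = regexp.MustCompile(`(?i)go`)
)

func main() {
	for _, s := range []string{"golang", "gopher", "google", "hugo", "Google"} {
		fmt.Printf("%-7s go=%t (?i)go=%t\n", s, goRegex.MatchString(s), anyCaseGo.MatchString(s))
	}
}
```

Only Google fails the case-sensitive pattern, since its one lower-case g is followed by l rather than o.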
What about non-verbatim cases, i.e., any number or any letter? We use what are called "character classes", categories of characters that fall under one umbrella. Here are some common ones:
\d - Decimal digits (0-9)
\s - Whitespace characters
\S - Non-whitespace characters
\w - Word characters (A-Z, a-z, 0-9, and _)
\W - Non-word characters
\b - Word boundary (the position between a word character and a non-word character)
The full regex syntax is described in the RE2 syntax documentation.
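A short sketch of two of these classes in use. The sample strings and the variable names digits and word are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	digits = regexp.MustCompile(`\d\d\d`) // three decimal digits in a row
	word   = regexp.MustCompile(`\bgo\b`) // "go" as a whole word
)

func main() {
	fmt.Println(digits.MatchString("CIS 193")) // true
	fmt.Println(digits.MatchString("CIS 19"))  // false
	fmt.Println(word.MatchString("let's go!")) // true
	fmt.Println(word.MatchString("golang"))    // false
}
```

The last case fails because the o in golang is followed by another word character, so there is no boundary after "go".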
There are also operators for repetition and alternatives. Repetition operators act on the element immediately before them:
+ - occurs 1 or more times
* - occurs 0 or more times
? - occurs 0 or 1 times
{n} - occurs exactly n times
{n,m} - occurs between n and m times
{n,} - occurs at least n times
a|b - "or" operator, matches either the regex on the left or the right
[abc] - match specific characters that are within the brackets, equivalent to "a or b or c"
[^abc] - exclude specific characters from match, equivalent to "not (a or b or c)"
. - "wildcard" character, matches any character (except a newline, by default)
To match an actual period character (.), escape it with a backslash: \.
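The operators above can be combined. The sketch below uses FindString, which returns the leftmost match, to show what each pattern picks out; the patterns and sample strings are illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	year  = regexp.MustCompile(`\d{4}`)          // exactly four digits
	color = regexp.MustCompile(`colou?r`)        // the "u" is optional
	image = regexp.MustCompile(`\w+\.(jpg|png)`) // literal period, then jpg or png
)

func main() {
	fmt.Println(year.FindString("Fall 2017, CIS 193")) // 2017
	fmt.Println(color.FindString("my favorite color")) // color
	fmt.Println(color.FindString("my favourite colour")) // colour
	fmt.Println(image.FindString("see IMG_629.jpg"))   // IMG_629.jpg
	fmt.Println(image.MatchString("IMG_629xjpg"))      // false
}
```

The last line is false because the escaped \. only matches a real period, not an arbitrary character.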
To extract information, wrap the desired part of the pattern in parentheses. This puts the matching text in a "capture group", which can be pulled out afterwards.
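Example: (IMG_\d+)\.jpg will get filenames like IMG_629.jpg, but only the part without the extension (IMG_629) is in the capture group. Note the escaped period; an unescaped . would also match any other character in that position.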
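In Go, capture groups are read with the Submatch family of methods. A minimal sketch, with the period escaped so that it matches only a literal dot; baseName is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

var imgRegex = regexp.MustCompile(`(IMG_\d+)\.jpg`)

// baseName returns the capture group, or "" when the input does not match.
func baseName(filename string) string {
	m := imgRegex.FindStringSubmatch(filename)
	if m == nil {
		return ""
	}
	// m[0] is the whole match; m[1] is the first capture group.
	return m[1]
}

func main() {
	fmt.Println(baseName("IMG_629.jpg")) // IMG_629
	fmt.Println(baseName("notes.txt") == "") // true
}
```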
Regex can be difficult to learn and easy to forget, and it has much, much more than we can cover in this class. Here are some resources if you want additional help.
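Define a regexp with regexp.Compile(), which returns a *Regexp and an error. For patterns fixed in the source code, regexp.MustCompile() returns only the *Regexp and panics if the pattern is invalid.
Use methods on *Regexp:
MatchString(s string) bool - reports whether the string contains any match of the regexp
FindAllString(s string, n int) []string - finds all matching substrings in s, up to n matches (all of them if n is negative)
The names of the 16 methods for identifying matched text are themselves matched by the regular expression
Find(All)?(String)?(Submatch)?(Index)?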
See the regexp documentation for full details on all the methods of *Regexp
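A brief sketch of a few of these methods, including the error returned by regexp.Compile. The helper findNumbers and the sample text are assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// findNumbers returns up to n runs of digits from s; n < 0 means no limit.
func findNumbers(s string, n int) ([]string, error) {
	re, err := regexp.Compile(`\d+`)
	if err != nil {
		return nil, err
	}
	return re.FindAllString(s, n), nil
}

func main() {
	all, _ := findNumbers("CIS 193, lecture 8, room 101", -1)
	fmt.Println(all) // [193 8 101]
	two, _ := findNumbers("CIS 193, lecture 8, room 101", 2)
	fmt.Println(two) // [193 8]

	// Compile reports invalid syntax as an error instead of panicking.
	_, err := regexp.Compile(`(unclosed`)
	fmt.Println(err != nil) // true

	// The Index variants return byte offsets rather than the text itself.
	re := regexp.MustCompile(`\d+`)
	fmt.Println(re.FindStringIndex("CIS 193")) // [4 7]
}
```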